The Lazy Bootstrap. A Fast Resampling Method for Evaluating Latent Class Model Fit
The latent class model is a powerful unsupervised clustering algorithm for
categorical data. Many statistics exist to test the fit of the latent class
model. However, traditional methods to evaluate those fit statistics are not
always useful. Asymptotic distributions are not always known, and empirical
reference distributions can be very time consuming to obtain. In this paper we
propose a fast resampling scheme with which any type of model fit can be
assessed. We illustrate it here on the latent class model, but the methodology
can be applied in any situation.
The principle behind the lazy bootstrap method is to specify a statistic
which captures the characteristics of the data that a model should capture
correctly. If those characteristics in the observed data and in model-generated
data are very different we can assume that the model could not have produced
the observed data. With this method we achieve the flexibility of tests from
the Bayesian framework, while only needing maximum likelihood estimates. We
provide a step-wise algorithm with which the fit of a model can be assessed
based on the characteristics we as researchers find important. In a Monte Carlo
study we show that the method has very low type I error rates for all illustrated
statistics. Power to reject a model depended largely on the type of statistic
that was used and on sample size. We applied the method to an empirical data
set on clinical subgroups with risk of myocardial infarction and compared the
results directly to the parametric bootstrap. The results of our method were
highly similar to those obtained by the parametric bootstrap, while the
required computations differed by three orders of magnitude in favour of our
method.
Comment: This is an adaptation of a chapter of a PhD dissertation available at
https://pure.uvt.nl/portal/files/19030880/Kollenburg_Computer_13_11_2017.pd
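The principle stated in the abstract above (compare a chosen statistic on the observed data with the same statistic on model-generated data) can be illustrated with a minimal Python sketch. It is a generic simulation-based check, not the authors' lazy bootstrap algorithm, whose steps the abstract does not spell out. The independence model for two binary items (equivalent to a one-class latent class model), the statistic `pair_agreement`, and the toy data are all assumptions made for illustration.

```python
import random

def pair_agreement(rows):
    """Share of respondents giving the same answer on both binary items."""
    return sum(1 for a, b in rows if a == b) / len(rows)

def simulate_independence(p_a, p_b, n, rng):
    """Draw n respondents from a model in which the two items are independent."""
    return [(int(rng.random() < p_a), int(rng.random() < p_b)) for _ in range(n)]

def reference_p_value(observed, n_reps=2000, seed=1):
    """Proportion of model-generated data sets whose statistic is at least
    as large as the statistic of the observed data."""
    rng = random.Random(seed)
    n = len(observed)
    # Maximum likelihood estimates of the illustrative independence model.
    p_a = sum(a for a, _ in observed) / n
    p_b = sum(b for _, b in observed) / n
    stat_obs = pair_agreement(observed)
    # Reference distribution: the same statistic on simulated data sets.
    stats = [pair_agreement(simulate_independence(p_a, p_b, n, rng))
             for _ in range(n_reps)]
    return sum(s >= stat_obs for s in stats) / n_reps

# Toy data: one strongly associated data set and one with no association.
associated = [(1, 1)] * 45 + [(0, 0)] * 45 + [(1, 0)] * 5 + [(0, 1)] * 5
unassociated = [(1, 1)] * 25 + [(0, 0)] * 25 + [(1, 0)] * 25 + [(0, 1)] * 25

p_associated = reference_p_value(associated)
p_unassociated = reference_p_value(unassociated)
print(p_associated, p_unassociated)
```

A small value for the associated data indicates that the independence model could hardly have produced that much agreement, while the unassociated data give no reason to reject it.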
Bayesian tests on components of the compound symmetry covariance matrix
Complex dependency structures are often conditionally modeled, where random effects parameters are used to specify the natural heterogeneity in the population. When interest is focused on the dependency structure, inferences can be made from a complex covariance matrix using a marginal modeling approach. In this marginal modeling framework, testing covariance parameters is not a boundary problem. Bayesian tests on covariance parameter(s) of the compound symmetry structure are proposed assuming multivariate normally distributed observations. Innovative proper prior distributions are introduced for the covariance components such that the positive definiteness of the (compound symmetry) covariance matrix is ensured. Furthermore, it is shown that the proposed priors on the covariance parameters lead to a balanced Bayes factor when testing an inequality constrained hypothesis. As an illustration, the proposed Bayes factor is used for testing (non-)invariant intra-class correlations across different group types (public and Catholic schools), using the 1982 High School and Beyond survey data.
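The positive definiteness requirement mentioned above can be made concrete with a short Python sketch. It assumes a parametrization by a common variance and a common correlation `rho`, which may differ from the parametrization used in the paper. The function names are hypothetical. Under this parametrization the matrix has only two distinct eigenvalues, so positive definiteness reduces to -1/(p - 1) < rho < 1.

```python
def compound_symmetry(p, variance, rho):
    """p x p covariance matrix with equal variances and one common correlation."""
    return [[variance if i == j else variance * rho for j in range(p)]
            for i in range(p)]

def cs_eigenvalues(p, variance, rho):
    """Closed-form eigenvalues: one for the all-ones direction and one
    (with multiplicity p - 1) for contrast directions."""
    return variance * (1 + (p - 1) * rho), variance * (1 - rho)

def is_positive_definite(p, variance, rho):
    """Positive definite exactly when both distinct eigenvalues are positive."""
    return all(ev > 0 for ev in cs_eigenvalues(p, variance, rho))

def matvec(m, v):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

sigma = compound_symmetry(4, 2.0, 0.3)
lead, rest = cs_eigenvalues(4, 2.0, 0.3)

# Check the eigen-relations numerically for the two kinds of eigenvectors.
ones_image = matvec(sigma, [1, 1, 1, 1])       # should equal lead * ones
contrast_image = matvec(sigma, [1, -1, 0, 0])  # should equal rest * contrast

print(is_positive_definite(4, 2.0, 0.3))
print(is_positive_definite(4, 2.0, -0.5))  # rho below -1/3 for p = 4
```

A prior that keeps the correlation inside this interval yields a valid covariance matrix for every draw.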
The matrix-F prior for estimating and testing covariance matrices
The matrix-F distribution is presented as a prior for covariance matrices as an alternative to the conjugate inverted Wishart distribution. A special case of the univariate F distribution for a variance parameter is equivalent to a half-t distribution for a standard deviation, which is becoming increasingly popular in the Bayesian literature. The matrix-F distribution can be conveniently modeled as a Wishart mixture of Wishart or inverse Wishart distributions, which allows straightforward implementation in a Gibbs sampler. By mixing the covariance matrix of a multivariate normal distribution with a matrix-F distribution, a multivariate horseshoe type prior is obtained which is useful for modeling sparse signals. Furthermore, it is shown that the intrinsic prior for testing covariance matrices in non-hierarchical models has a matrix-F distribution. This intrinsic prior is also useful for testing inequality constrained hypotheses on variances. Finally, through simulation it is shown that the matrix-variate F distribution has good frequentist properties as a prior for the random effects covariance matrix in generalized linear mixed models.
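The equivalence between the univariate F distribution on a variance and the half-t distribution on a standard deviation follows from the standard fact that the square of a t-distributed variable is F-distributed. The short derivation below uses notation assumed for illustration (scale s, degrees of freedom ν), which may differ from the paper's own.

```latex
\sigma = s\,|T|,\quad T \sim t_{\nu}
\;\Longrightarrow\;
\frac{\sigma^{2}}{s^{2}} = T^{2} = \frac{Z^{2}/1}{V/\nu} \sim F(1,\nu),
\qquad Z \sim N(0,1),\; V \sim \chi^{2}_{\nu},\; Z \perp V .
```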
Simple Bayesian testing of scientific expectations in linear regression models
Scientific theories can often be formulated using equality and order
constraints on the relative effects in a linear regression model. For example,
it may be expected that the effect of the first predictor is larger than the
effect of the second predictor, which in turn is expected to be larger than
the effect of the third predictor. The goal is then to test such expectations
against competing scientific expectations or theories. In this paper a simple
default Bayes factor test is proposed for testing multiple hypotheses with
equality and order constraints on the effects of interest. The proposed testing
criterion can be computed without requiring external prior information about
the expected effects before observing the data. The method is implemented in
an R package called 'lmhyp', which is freely downloadable and ready to use.
The usability of the method and software is illustrated using empirical
applications from the social and behavioral sciences.
Comment: 33 pages, 1 figure, 2 appendices
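One common way to quantify evidence for an order-constrained hypothesis against an unconstrained alternative is the ratio of the posterior to the prior probability that the ordering holds. The Python sketch below estimates both by Monte Carlo. It is not the computation performed by lmhyp. The independent normal prior and posterior, the made-up estimates and standard errors, and all function names are assumptions for illustration only.

```python
import random

def order_holds(beta):
    """True when beta1 > beta2 > beta3."""
    return beta[0] > beta[1] > beta[2]

def proportion_in_order(means, sds, n_draws, rng):
    """Monte Carlo probability of the ordering under independent normal draws."""
    hits = 0
    for _ in range(n_draws):
        draw = [rng.gauss(m, s) for m, s in zip(means, sds)]
        hits += order_holds(draw)
    return hits / n_draws

rng = random.Random(2024)

# Hypothetical prior: same centre and spread for all three effects, so each
# of the 3! = 6 orderings is equally likely (prior proportion near 1/6).
prior_prop = proportion_in_order([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], 50_000, rng)

# Hypothetical posterior: made-up estimates and standard errors, treated as
# independent for simplicity (a real analysis would use their covariance).
post_prop = proportion_in_order([0.6, 0.3, 0.1], [0.1, 0.1, 0.1], 50_000, rng)

# Evidence for the ordering relative to the unconstrained model.
bayes_factor = post_prop / prior_prop
print(prior_prop, post_prop, bayes_factor)
```

Because the posterior proportion cannot exceed 1, the ratio in this three-effect example is bounded by roughly 6, the inverse of the prior proportion.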
BIC extensions for order-constrained model selection
The Schwarz or Bayesian information criterion (BIC) is one of the most widely used tools for model comparison in social science research. The BIC, however, is not suitable for evaluating models with order constraints on the parameters of interest. This article explores two extensions of the BIC for evaluating order-constrained models, one where a truncated unit information prior is used under the order-constrained model and the other where a truncated local unit information prior is used. The first prior is centered on the maximum likelihood estimate, and the latter prior is centered on a null value. Several analyses show that the order-constrained BIC based on the local unit information prior works better as an Occam’s razor for evaluating order-constrained models and results in lower error probabilities. The methodology based on the local unit information prior is implemented in the R package “BICpack” which allows researchers to easily apply the method for order-constrained model selection. The usefulness of the methodology is illustrated using data from the European Values Study.
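For reference, the standard (unextended) BIC is the maximized log-likelihood penalized by the number of free parameters. The Python sketch below shows only this standard criterion, not the order-constrained extensions or the BICpack interface. The log-likelihood values and parameter counts are made up for illustration.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Schwarz criterion: -2 * log-likelihood + number of parameters * log(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

# Made-up fits of two models to the same 200 observations.
bic_small = bic(log_likelihood=-310.2, n_params=3, n_obs=200)
bic_large = bic(log_likelihood=-308.9, n_params=5, n_obs=200)

# The model with the lower BIC is preferred.
preferred = "small" if bic_small < bic_large else "large"
print(bic_small, bic_large, preferred)
```

The penalty depends only on the count of free parameters. An order-constrained model has the same count as its unconstrained counterpart and, when the maximum likelihood estimates satisfy the constraints, the same maximized likelihood, so the standard BIC cannot distinguish the two.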
Bayesian multilevel multivariate logistic regression for superiority decision-making under observable treatment heterogeneity
In social, medical, and behavioral research we often encounter datasets with a multilevel structure and multiple correlated dependent variables. These data are frequently collected from a study population that distinguishes several subpopulations with different (i.e. heterogeneous) effects of an intervention. Despite the frequent occurrence of such data, methods to analyze them are less common and researchers often resort to either ignoring the multilevel and/or heterogeneous structure, analyzing only a single dependent variable, or a combination of these. These analysis strategies are suboptimal: Ignoring multilevel structures inflates Type I error rates, while neglecting the multivariate or heterogeneous structure masks detailed insights. To analyze such data comprehensively, the current paper presents a novel Bayesian multilevel multivariate logistic regression model. The clustered structure of multilevel data is taken into account, such that posterior inferences can be made with accurate error rates. Further, the model shares information between different subpopulations in the estimation of average and conditional average multivariate treatment effects. To facilitate interpretation, multivariate logistic regression parameters are transformed to posterior success probabilities and differences between them. A numerical evaluation compared our framework to less comprehensive alternatives and highlighted the need to model the multilevel structure: Treatment comparisons based on the multilevel model had targeted Type I error rates, while single-level alternatives resulted in inflated Type I errors. A re-analysis of the Third International Stroke Trial data illustrated how incorporating a multilevel structure, assessing treatment heterogeneity, and combining dependent variables contributed to an in-depth understanding of treatment effects.